Inferring Urban Land Use from Satellite Sensor Images Using Kernel-Based Spatial Reclassification
Authors: M.J. Barnsley and S.L. Barr
Abstract
Per-pixel classification algorithms are poorly equipped to monitor urban land use in images acquired by the current generation of high spatial resolution satellite sensors. This is because urban areas commonly comprise a complex spatial assemblage of spectrally distinct land-cover types. In this study, a technique is described that attempts to derive information on urban land use in two stages. The first involves classification of the image into broad land-cover types. In the second stage, referred to as spatial reclassification, the classified pixels are grouped into discrete land-use categories on the basis of both the frequency and the spatial arrangement of the land-cover labels within a square kernel. The application of this technique, known as SPARK (SPAtial Reclassification Kernel), is demonstrated using a SPOT-1 HRV multispectral image of southeast London, England. Preliminary results indicate that SPARK can be used to distinguish quite subtle differences of land use in urban areas.

Introduction

While satellite sensor technology has been used with some success to monitor land use in images of agricultural areas, much less satisfactory results are generally reported for urban scenes (Forster, 1985; Toll, 1985; Barnsley et al., 1989; Sadler and Barnsley, 1990). Initially, this disparity was attributed to the relatively coarse spatial resolution of early Earth-resources sensors, such as the Landsat Multispectral Scanning System (MSS) (Jackson et al., 1980; Forster, 1980). By averaging the spectral response of buildings, roads, trees, grass, and other component elements of urban scenes within their large instantaneous field-of-view (IFOV), these sensors tended to produce broad, composite signals for urban areas. Consequently, it was often difficult to distinguish between different categories of urban land use in the resultant images. Moreover, the "blocky" appearance of these images inhibited accurate delineation of the urban-rural boundary.
Unfortunately, the use of higher resolution data from the current generation of satellite sensors has not always yielded the improvements anticipated (Toll, 1985; Forster, 1985; Martin et al., 1988). Indeed, some studies report a reduction in the accuracy with which different urban land uses can be distinguished in such images, relative to that obtained using coarser resolution data (Haack et al., 1987; Martin et al., 1988). This apparently paradoxical phenomenon has been ascribed to the problem of "scene noise" (Gastellu-Etchegorry, 1990; Gong and Howarth, 1990). In other words, as the spatial resolution of the sensor increases, individual scene elements (e.g., buildings, roads, and open spaces) begin to dominate the detected response of each pixel; therefore, the spectral response of urban areas as a whole becomes more varied, making consistent classification of land use problematic (Gastellu-Etchegorry, 1990). Although it is tempting to set this problem in the context of inadequate or inappropriate sensor spatial resolution, it is perhaps more accurately expressed in terms of the limitations of commonly used information extraction techniques, in particular standard, per-pixel, multispectral classification algorithms.

Remote Sensing Unit, Department of Geography, University College London, 16 Bedford Way, London WC1H 0AP, United Kingdom. M.J. Barnsley is presently with the Department of Geography, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, United Kingdom. S.L. Barr is presently with the Department of Geography, University of Manchester, Mansfield Cooper Building, Oxford Road, Manchester M13 9PL, United Kingdom.
PE&RS, August 1996.
The fundamental problem involved in producing accurate land-use maps of towns and cities from remotely sensed images is that urban areas comprise a complex spatial assemblage of land-cover types, each of which may have different spectral reflectance characteristics (Wharton, 1982a; Wharton, 1982b; Gong and Howarth, 1990; Barnsley et al., 1991; Eyton, 1993). Unfortunately, per-pixel classification algorithms are poorly equipped to deal with this type of spatial variability, because they assign each pixel to one of the candidate classes solely on the basis of its spectral reflectance properties (Woodcock and Strahler, 1987; Barnsley et al., 1991; Barnsley and Barr, 1992). The location of the pixel within the image and the relationship between its spectral response and that of its neighbors are not taken into account. A further problem for supervised, per-pixel classification is that it is extremely difficult to define suitable training sets for many categories of urban land use, due to the variation in the spectral response of their component land-cover types (Forster, 1985; Gong and Howarth, 1990; Barnsley et al., 1991). Thus, the training statistics may exhibit both a multimodal distribution and a large standard deviation in each spectral waveband (Sadler et al., 1991). The implication of the former is that the training statistics for urban areas violate one of the basic assumptions of the widely used maximum-likelihood decision rule, namely, that the pixel values follow a multivariate normal distribution. The effect of the latter is often to produce a pronounced overlap between urban and non-urban land-use categories in the multispectral feature space. This may be further compounded by the fact that the mean spectral response for the urban classes will differ from those of the non-urban classes in a somewhat arbitrary and unpredictable manner, depending on the location of the training areas (Barnsley et al., 1991; Sadler et al., 1991).

Photogrammetric Engineering & Remote Sensing, Vol. 62, No. 8, August 1996, pp. 949-958. 0099-1112/96/6208-949$3.00/0 © 1996 American Society for Photogrammetry and Remote Sensing

Figure 1. Simulated 3- by 3-pixel windows showing the possible distributions of land-cover types for two urban land-use categories: (a) Commercial/Industrial; (b) Residential. (B = Building; T = Tree; G = Grass.)

Various attempts have been made to overcome these problems, including
- The use of pre-classification image transformations and feature-extraction techniques, such as median filters (Atkinson et al., 1985; Sadler et al., 1991) and various measures of image texture (Haralick, 1979; Baraldi and Parmiggiani, 1990; Franklin and Peddle, 1990; Gong and Howarth, 1990; Sadler et al., 1991);
- The incorporation of spatially referenced, ancillary data into the classification procedure (Forster, 1984; Sadler and Barnsley, 1990; Ehlers et al., 1991; Sadler et al., 1991);
- The use of enhanced classification algorithms, ranging from contextual classifiers (Gurney, 1981; Gurney and Townshend, 1983; Gong and Howarth, 1992), through knowledge-based expert systems (Mehldau and Schowengerdt, 1990; Moller-Jensen, 1990), to artificial neural networks (Hepner et al., 1990; Kanellopoulos et al., 1992; Civco, 1993; Dryer, 1993); and
- The application of post-classification spatial processing, ranging from simple majority filters to spatial (or contextual) reclassification procedures (Thomas, 1980; Wharton, 1982a; Wharton, 1982b; Gurney and Townshend, 1983; Gong and Howarth, 1990; Whitehouse, 1990; Guo and Moore, 1991; Gong and Howarth, 1992; Wang and Civco, 1992a; Wang and Civco, 1992b; Eyton, 1993).

However, not all of these techniques directly address the problem of inferring land use from a complex spatial mixture of spectrally distinct land-cover types.
For example, pre-classification spatial filtering attempts to circumvent the problem by suppressing some of the spatial variability within the image. This is achieved only at the expense of a reduction in the effective spatial resolution of the data set. It also produces somewhat arbitrary mean vectors for urban land-use categories by aggregating the detected spectral responses of their component land-cover types. Of the other techniques, spatial (or contextual) reclassification represents a comparatively simple way to examine the spatial variation in land cover in remotely sensed images, and is easy to implement in most image processing systems.

Spatial reclassification techniques divide the classification process into two stages: the first involves a standard, per-pixel classification of the scene; the second involves some form of post-classification spatial processing of these data. Use of this procedure to infer urban land use from the spatial arrangement of land cover was first suggested by Wharton (1982a; 1982b). The assumption underlying this approach is that individual categories of land use have characteristic spatial mixtures of spectrally distinct land-cover types that enable their recognition in high spatial resolution images (Wharton, 1982a; Wharton, 1982b; Barnsley and Barr, 1992). For example, residential districts might be characterized by the intermixing of roofs, roads, and gardens.

Spatial reclassification can be performed in one of two ways. The first, referred to as kernel-based spatial reclassification (Gurney and Townshend, 1983; Barnsley and Barr, 1992), involves passing a simple convolution kernel across the land-cover image.
In the second, referred to as object-based spatial reclassification (Gurney and Townshend, 1983), discrete "objects" (i.e., groups of adjacent pixels with the same class label) are identified within the initial image segmentation; information on the size, shape, and spatial arrangement of these objects is subsequently used to determine the nature of the land use in different parts of the image (Barr, 1992; Barr and Barnsley, 1993; Barnsley et al., 1993).

In this paper, we describe a kernel-based procedure, referred to as SPARK (SPAtial Reclassification Kernel). SPARK examines both the frequency and the spatial arrangement of class (land-cover) labels within a square kernel. This technique is tested using a subscene extracted from a SPOT-1 HRV multispectral image of south-east London, England.

SPARK: A SPAtial Reclassification Kernel

Background

The work by Wharton (1982a; 1982b) provides an early example of kernel-based spatial reclassification, in which the initial low-level segmentation of the image is performed using a standard, unsupervised classification algorithm. The frequency of different land-cover types within each n- by n-pixel region is then calculated by convolving a simple, rectangular kernel with the classified image. The land use associated with the pixel at the center of the kernel is derived using an unsupervised, non-parametric clustering procedure applied to these frequency data. Similar techniques have been used more recently by Whitehouse (1990), Guo and Moore (1991), Gong and Howarth (1992), and Eyton (1993), although in these studies, with the exception of Eyton (1993), the frequency distribution of land-cover types surrounding each pixel is compared with those of known areas of the candidate land-use categories. Although the method developed by Wharton examines the frequency with which different class labels occur within the kernel, it does not account for differences in their spatial arrangement.
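The frequency-counting step described above, and its insensitivity to spatial arrangement, can be sketched as follows. This is a minimal illustration rather than Wharton's implementation: the function name `label_frequencies`, the integer class codes, and the two 3- by 3-pixel windows are assumptions introduced here, and the windows are hypothetical layouts, not copies of any figure.

```python
import numpy as np

# Hypothetical class codes, introduced only for this sketch.
BUILDING, TREE, GRASS = 0, 1, 2


def label_frequencies(window, n_classes):
    """Number of pixels carrying each class label inside one kernel position."""
    return np.bincount(window.ravel(), minlength=n_classes)


# Assumed layout 1: the four Building pixels form a single block.
window_a = np.array([[0, 0, 1],
                     [0, 0, 1],
                     [2, 2, 2]])

# Assumed layout 2: the same labels, but with the Building pixels dispersed.
window_b = np.array([[0, 1, 0],
                     [2, 0, 1],
                     [2, 2, 0]])

freq_a = label_frequencies(window_a, 3)
freq_b = label_frequencies(window_b, 3)

print(freq_a, freq_b)
print(np.array_equal(freq_a, freq_b))
```

Both windows contain four Building, two Tree, and three Grass pixels, so a purely frequency-based procedure receives identical input for the two arrangements.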
The limitation imposed by ignoring spatial arrangement is evident in the following example. Consider two separate 3- by 3-pixel windows, each of which has four pixels labeled as the land-cover class "Building." In an industrial or commercial district, where these might represent a single large factory or warehouse, the pixels are likely to be clustered together in a block (Figure 1a). By contrast, in a residential area, where the same class labels might represent individual houses, the "Building" pixels might be arranged in a line (terraced housing) or might be physically separate (detached housing) (Figure 1b). However, a procedure which simply calculates the frequency of different class labels within these windows will have no means of distinguishing these two conditions.

The example illustrates the need to find a reliable method for recording both the frequency and the spatial arrangement of class labels within a given section of the image. One way to do this is to record the number of times that different class labels occur next to one another within a pre-defined, moving window. A simple technique to achieve this, referred to as the SPAtial Reclassification Kernel (SPARK), is described in this paper.

The SPAtial Reclassification Kernel (SPARK)

SPARK operates by examining pairs of adjacent pixels within a square kernel (i.e., those connected along an edge or by a vertex), the size of which is selected by the user (Figure 2). The class label associated with each pixel defines the nature of the "adjacency event." For example, contiguous pixels labeled "Building" and "Tree," respectively, produce a Building-Tree adjacency event. Note that each pair of pixels produces a single adjacency event, so that the order of the labels is not significant: i.e., the adjacency events "Tree-Building" and "Building-Tree" are identical. Thus, in Figure 1a there are six Building-Building adjacency events, four of Building-Tree, five of Building-Grass, and so on. By comparison, although the window in Figure 1b contains exactly the same number of pixels belonging to each class, there are only three Building-Building adjacency events, but six of Building-Tree.

Figure 2. Adjacency events in a 3- by 3-pixel window (panel labels: Vertex Adjacency Events; Edge Adjacency Events).

In practice, SPARK is convolved with the land-cover image to produce an adjacency-event matrix, M, for each pixel: i.e.,

\[
M = \begin{pmatrix}
f_{11} & f_{12} & f_{13} & \cdots & f_{1C} \\
f_{21} & f_{22} & f_{23} & \cdots & f_{2C} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
f_{C1} & f_{C2} & f_{C3} & \cdots & f_{CC}
\end{pmatrix}
\tag{1}
\]

The value of each element, f_ij, of the matrix denotes the frequency with which pixels belonging to class i are adjacent to those belonging to class j, for the current position of the kernel. The number of elements in M is determined by the number of classes, C, in the image and is therefore independent of the kernel size. Note that we only consider the upper triangular elements, because M_ij = M_ji. For most studies, where the number of land-cover classes is reasonably small, this represents an efficient means of storing information about the spatial arrangement of the land-cover types within the image. Table 1 shows the adjacency-event matrices for the 3- by 3-pixel windows presented in Figure 1.

Table 1. Adjacency-event matrices for the simulated 3- by 3-pixel windows shown in Figures 1a and 1b, respectively.

The adjacency-event matrix, M, described above, is similar in some respects to the spatial-dependency (or co-occurrence) matrix devised by Haralick (1979), though here we deal with class labels rather than with raw digital numbers (DN). It is also closely related to several of the measures of spatial variability used in landscape ecology, notably "Contagion" (Robinove, 1986; Turner, 1989) and the Binary Comparison Matrix (BCM) developed by Murphy (1985). However, there are several important differences between SPARK and these indices. First, unlike the BCM, the number of elements in SPARK's adjacency-event matrix is independent of the kernel size. Second, SPARK records information on the precise nature of each adjacency event, whereas the BCM simply notes whether adjacent pixels have the same or different class labels. Third, unlike contagion, SPARK examines adjacency between pixels connected vertex-to-vertex, as well as edge-to-edge. Finally, SPARK produces values ranging between 0 and 1, irrespective of the number of classes (cf. contagion).

Assigning Pixels to Land-Use Categories Using SPARK

The land-use category, k, for a given pixel is determined by comparing its adjacency-event matrix, M, with those derived from representative sample areas of the candidate land-use categories; the latter will be referred to as "template" matrices, T_k. Note that the sample areas used to generate the template matrices are the same size as the spatial reclassification kernel. Multiple template matrices can be defined for each land use. These may either be used independently or be pooled to produce an "average" template matrix. The advantage of using a series of independent templates for a single land use is that subtle variations in the spatial arrangement of its constituent land-cover types at different locations within the image can be taken into account. However, it also results in a linear increase in computation time. On the other hand, use of pooled or "average" template matrices may result in overlap between land-use classes in "adjacency space."

As the spatial reclassification kernel is passed over the image, the current adjacency-event matrix is compared with each of the template matrices using Equation 2: i.e.,

\[
\Delta_k = 1 - \sqrt{0.5\,N^{-2} \sum_{i=1}^{C} \sum_{j=1}^{C} \left( M_{ij} - T_{kij} \right)^{2}}
\tag{2}
\]

\[
0 \le \Delta_k \le 1
\tag{3}
\]

where M_ij is an element of the current adjacency-event matrix, T_kij is the corresponding element of the template matrix for land-use category k, N is the total number of adjacency events in the kernel (determined by the kernel size, e.g., N = 20 for a 3- by 3-pixel kernel, N = 72 for a 5- by 5-pixel kernel; recall that a pair of adjacent pixels produces a single adjacency event), and C is the number of land-cover classes in the image. The term Δ_k can be thought of as an index of similarity between the current adjacency-event matrix and the template matrix for land-use category k. Thus, a value of 1.0 indicates a perfect match with one of the land-use templates, while a value of 0.0 indicates no match. The pixel at the center of the kernel is, therefore, assigned to the land-use category k for which Δ_k is maximized. A user-specified threshold can be set to prevent pixels being assigned to a land-use category on the basis of a weak match between the measured adjacency-event matrix and a land-use template.